@daixque daixque commented Aug 29, 2024

This PR adds documentation for Lucene's new Japanese text filters to the analysis-kuromoji plugin docs.

New filters are:

  • hiragana_uppercase
  • katakana_uppercase

These filters were introduced in Lucene 9.11 and are available in Elasticsearch from 8.15.

This is related to:
* elastic#106553
@daixque daixque added the >docs, Team:Docs, and v8.15.0 labels Aug 29, 2024
@elasticsearchmachine

Pinging @elastic/es-docs (Team:Docs)

@elasticsearchmachine elasticsearchmachine added the v8.16.0 and external-contributor labels Aug 29, 2024

@leemthompo leemthompo left a comment

LGTM. I tested that this works as documented. I have one minor non-blocking question about wording.

Thanks for opening the PR and adding to the docs 🏅

[[analysis-kuromoji-hiragana-uppercase]]
==== `hiragana_uppercase` token filter

The `hiragana_uppercase` token filter normalizes small letters (捨て仮名) in hiragana into normal letters.

@leemthompo leemthompo Sep 4, 2024

Is "normal letters" accepted phrasing?

Maybe "The hiragana_uppercase token filter normalizes small Hiragana letters (捨て仮名) into full-size Hiragana letters"?

Contributor Author

Good point, maybe standard (or regular) would be better. The word "Full-size" sounds like "full-width" (multi-byte), which is not the case here. Let me change "normal" to "standard".

Contributor

Sounds good! Glad I was able to communicate that despite my ignorance of linguistic terms :)
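
For readers following the wording discussion, here is a minimal Python sketch of what "small letters into standard letters" means for hiragana. It is only an illustration: the function name `hiragana_uppercase` is borrowed from the filter name, the mapping table is a hand-written assumption covering common small kana, and it is not taken from Lucene's source or run through the kuromoji tokenizer.

```python
# Illustrative sketch only: a hand-written mapping from small hiragana
# (捨て仮名) to their standard-size counterparts. This table is an
# assumption for demonstration and may not match Lucene's filter exactly.
SMALL_TO_STANDARD_HIRAGANA = str.maketrans({
    "ぁ": "あ", "ぃ": "い", "ぅ": "う", "ぇ": "え", "ぉ": "お",
    "っ": "つ", "ゃ": "や", "ゅ": "ゆ", "ょ": "よ", "ゎ": "わ",
})


def hiragana_uppercase(token: str) -> str:
    """Replace each small hiragana letter in a token with its standard form."""
    return token.translate(SMALL_TO_STANDARD_HIRAGANA)


print(hiragana_uppercase("ちょっと"))  # ちよつと
```

Note that both the input and the output are ordinary full-width characters, which is why "full-size" was set aside in favor of "standard" in the thread above.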

[[analysis-kuromoji-katakana-uppercase]]
==== `katakana_uppercase` token filter

The `katakana_uppercase` token filter normalizes small letters (捨て仮名) in katakana into normal letters.
Contributor

Same question as above.
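
The katakana case can be sketched the same way. Again, this is a hedged illustration rather than the filter's implementation: `katakana_uppercase` is a hypothetical helper named after the filter, and the mapping is a hand-written subset that omits less common small katakana.

```python
# Illustrative sketch only: a hand-written subset of small katakana mapped
# to standard-size katakana. This is an assumption for demonstration and
# is not copied from Lucene's source.
SMALL_TO_STANDARD_KATAKANA = str.maketrans({
    "ァ": "ア", "ィ": "イ", "ゥ": "ウ", "ェ": "エ", "ォ": "オ",
    "ッ": "ツ", "ャ": "ヤ", "ュ": "ユ", "ョ": "ヨ", "ヮ": "ワ",
})


def katakana_uppercase(token: str) -> str:
    """Replace each small katakana letter in a token with its standard form."""
    return token.translate(SMALL_TO_STANDARD_KATAKANA)


print(katakana_uppercase("ストップウォッチ"))  # ストツプウオツチ
print(katakana_uppercase("ちょっと"))          # hiragana is left unchanged
```

In this sketch the katakana mapping leaves hiragana untouched, which mirrors why the PR documents the two filters separately.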

@leemthompo

🚢

@leemthompo leemthompo merged commit 2982fc6 into elastic:main Sep 4, 2024
5 checks passed
leemthompo pushed a commit to leemthompo/elasticsearch that referenced this pull request Sep 4, 2024
@leemthompo

💚 All backports created successfully

Branch: 8.15

elasticsearchmachine pushed a commit that referenced this pull request Sep 4, 2024
cbuescher pushed a commit to cbuescher/elasticsearch that referenced this pull request Sep 4, 2024